For this tutorial, we will analyze Erik Meyersson’s 2014 paper, “Islamic Rule and the Empowerment of the Poor and Pious.” Meyersson asks whether political control by Islamic religious political parties leads to a decrease in women’s rights, particularly female education rates.

Meyersson looks at Turkey in 1994, where an Islamic party gained political control in many municipalities, and a number of the election results were very close. He uses a regression discontinuity analysis of Islamic control on the rate of secondary school completion by girls, focusing on the Local Average Treatment Effect in areas where Islamic parties barely won or lost their elections.

You’ll need to install the ‘rdd’ package for this tutorial. You can find the dataset here; it contains the variables listed in the codebook below.

  1. Suppose Meyersson found that women’s level of education is associated with the religious affiliation of the party in control. But such a finding would not suffice to conclude the nature of the causal association. Why is that? What alternative explanations might explain this association? How would we decide which explanation is the correct one? Find the place in Meyersson’s article that addresses this point, and discuss.

Codebook for Meyersson’s (2014) paper

Variable         Description
iwm94            running/treatment variable: margin of Islamic party win or loss in 1994 (pp); 0 indicates an exact tie, and a margin greater than zero means the Islamic party won
hischshr1520m    outcome: secondary school completion rates for males aged 15-20
hischshr1520f    outcome: secondary school completion rates for females aged 15-20
lpop1994         log of the locality population in 1994
sexr             sex ratio in locality
lareapre         log of locality area

  1. Calculate the difference in means in secondary school completion rates, separately for females and for males, comparing regions where Islamic parties won and lost in 1994. This is the SDO, the simple difference in means \(\mathbb{E}\big[Y|T=1\big]-\mathbb{E}\big[Y|T=0\big]\). Do you think this is a credible estimate of the causal effect of Islamic party control? Why or why not? (First create a treatment variable, islamicwin, that indicates whether or not the Islamic party won the 1994 election. Then use the option na.rm=TRUE when computing the means to ignore missing data.)
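
As a rough illustration of the mechanics only, the sketch below computes a simple difference in means on synthetic data. The data frame dat is simulated here and is not the Meyersson dataset; only the column names follow the codebook, and the numbers it prints say nothing about the real answer. The male outcome can be handled in the same way.

```r
# Synthetic stand-in data (an assumption for illustration, NOT the real dataset).
set.seed(1)
n <- 500
dat <- data.frame(iwm94 = runif(n, -1, 1))
dat$hischshr1520f <- 0.15 - 0.05 * dat$iwm94 + rnorm(n, sd = 0.05)
dat$hischshr1520f[sample(n, 10)] <- NA  # mimic a few missing outcomes

# Treatment indicator: 1 if the Islamic party's win margin is positive.
dat$islamicwin <- as.numeric(dat$iwm94 > 0)

# Group means of the outcome, ignoring missing values.
mean_won  <- mean(dat$hischshr1520f[dat$islamicwin == 1], na.rm = TRUE)
mean_lost <- mean(dat$hischshr1520f[dat$islamicwin == 0], na.rm = TRUE)

# Simple difference in means: E[Y | T = 1] - E[Y | T = 0].
sdo <- mean_won - mean_lost
print(sdo)
```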

  2. Now we’ll start regression discontinuity analysis. First, select optimal bandwidths for testing female high school completion rates using the Imbens-Kalyanaram procedure. For this, you will need the IKbandwidth function from the rdd package. Read the help file to see what the function requires.

    • Explain what the bandwidth means in this case. Using this bandwidth, create a new dataset containing only data within the optimal bandwidth. Note that if the cutoff point is \(C_o\) and the optimal bandwidth is \(h\), you want to include regions where the running variable satisfies \(X_{running}\in[C_o-h, C_o+h]\).
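
The bandwidth selection and the windowing step can be sketched as follows. This is a minimal example on simulated data, not the real dataset: the data frame dat and the names cutoff, h and dat_bw are assumptions made for illustration. IKbandwidth takes the running variable as X and the outcome as Y.

```r
library(rdd)  # provides IKbandwidth()

# Synthetic stand-in data (an assumption for illustration, NOT the real dataset).
set.seed(2)
n <- 1000
dat <- data.frame(iwm94 = runif(n, -1, 1))
dat$hischshr1520f <- 0.15 + 0.03 * (dat$iwm94 > 0) - 0.05 * dat$iwm94 +
  rnorm(n, sd = 0.05)

# Imbens-Kalyanaraman bandwidth around the cutoff C_o = 0.
cutoff <- 0
h <- IKbandwidth(X = dat$iwm94, Y = dat$hischshr1520f, cutpoint = cutoff)
print(h)

# Keep only rows with the running variable in [cutoff - h, cutoff + h].
# subset() also drops rows where the condition is NA.
dat_bw <- subset(dat, abs(iwm94 - cutoff) <= h)
print(nrow(dat_bw))
```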


  1. Find the Local Average Treatment Effect of Islamic party control on women’s secondary school education at the threshold, using the dataset restricted to the optimal bandwidth that you created in the previous step and a simple linear regression that includes the treatment and the running variable. How credible do you think this result is?
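
One way to set up such a regression is sketched below on simulated data. The data frame dat is synthetic, and the bandwidth h = 0.25 is a placeholder chosen for illustration; in the exercise you would plug in the Imbens-Kalyanaraman value instead. The coefficient on islamicwin is the estimated jump at the threshold under a common linear slope on both sides.

```r
# Synthetic stand-in data (an assumption for illustration, NOT the real dataset).
set.seed(3)
n <- 1000
dat <- data.frame(iwm94 = runif(n, -1, 1))
dat$islamicwin <- as.numeric(dat$iwm94 > 0)
dat$hischshr1520f <- 0.15 + 0.03 * dat$islamicwin - 0.05 * dat$iwm94 +
  rnorm(n, sd = 0.05)

# Placeholder bandwidth (hypothetical); use the IK bandwidth in the exercise.
h <- 0.25
dat_bw <- subset(dat, abs(iwm94) <= h)

# Linear regression of the outcome on treatment and running variable,
# using only observations inside the window.
fit <- lm(hischshr1520f ~ islamicwin + iwm94, data = dat_bw)
late_lm <- unname(coef(fit)["islamicwin"])
print(summary(fit)$coefficients)
```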

  2. Use RD estimation to find the Local Average Treatment Effect of Islamic party control on men’s and women’s secondary school education at the threshold, using local linear regression estimated with the RDestimate function from the rdd package. Does the estimate differ from your previous estimates? Your code should be of the form RDestimate(___~___, cutpoint=___, bw=___, data=___)
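
A minimal sketch of the call, again on simulated data rather than the real dataset (the data frame dat and the object names h and rdest are assumptions). For a sharp design the formula has the outcome on the left and the running variable on the right.

```r
library(rdd)  # provides RDestimate() and IKbandwidth()

# Synthetic stand-in data (an assumption for illustration, NOT the real dataset).
set.seed(4)
n <- 1000
dat <- data.frame(iwm94 = runif(n, -1, 1))
dat$hischshr1520f <- 0.15 + 0.03 * (dat$iwm94 > 0) - 0.05 * dat$iwm94 +
  rnorm(n, sd = 0.05)

# Local linear RD estimate at the cutoff, using the IK bandwidth.
h <- IKbandwidth(X = dat$iwm94, Y = dat$hischshr1520f, cutpoint = 0)
rdest <- RDestimate(hischshr1520f ~ iwm94, data = dat, cutpoint = 0, bw = h)
summary(rdest)

# The first element is the estimate at the chosen bandwidth;
# the second and third use half and double that bandwidth.
print(rdest$est[1])
print(rdest$se[1])
```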

  3. Plot the relationship between the running variable and outcome using local linear regressions. Use your plot to explain why your results do or do not differ strongly compared to your previous results.

    • Code Hints: You can call the plot() function directly on the object you created in the previous question. Use the range argument to control the x-axis.


  1. Perform placebo tests to check that the relationship between the running variable and outcome is not fundamentally discontinuous, by computing RD estimates at placebo cutoffs of -0.1, -0.05, 0.05 and 0.1. What do you conclude? To run placebo cutoffs, use RDestimate as before but with a different cutpoint.
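
The placebo loop might look like the sketch below. It runs on simulated data with a true jump only at zero; dat, placebo_cuts and placebo are names introduced here for illustration. Leaving bw unset lets RDestimate pick the Imbens-Kalyanaraman bandwidth for each cutpoint.

```r
library(rdd)  # provides RDestimate()

# Synthetic stand-in data (an assumption for illustration, NOT the real dataset).
set.seed(5)
n <- 2000
dat <- data.frame(iwm94 = runif(n, -1, 1))
dat$hischshr1520f <- 0.15 + 0.03 * (dat$iwm94 > 0) - 0.05 * dat$iwm94 +
  rnorm(n, sd = 0.05)

# RD estimate, standard error and p-value at each placebo cutoff.
placebo_cuts <- c(-0.1, -0.05, 0.05, 0.1)
placebo <- t(sapply(placebo_cuts, function(cp) {
  rd <- RDestimate(hischshr1520f ~ iwm94, data = dat, cutpoint = cp)
  c(cutpoint = cp,
    est = unname(rd$est[1]),
    se  = unname(rd$se[1]),
    p   = unname(rd$p[1]))
}))
print(placebo)
```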

  2. Perform a robustness check for local randomisation at the threshold by computing RD estimates, using RDestimate in the same way as in the main analysis above, for the three background covariates sexr, lpop1994 and lareapre. What do you conclude?

  3. Perform a McCrary test, another way to check for sorting at the threshold. Plot and interpret the results. Code Hints: Use the DCdensity function in the rdd package with the option verbose=TRUE.
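
A hedged sketch of the density test on a simulated running variable follows. The vector iwm94 here is drawn uniformly, so by construction there is no sorting; it is not the real data. With the default settings DCdensity draws the density plot and returns the p-value of the test.

```r
library(rdd)  # provides DCdensity()

# Synthetic running variable (an assumption for illustration, NOT the real data).
set.seed(6)
iwm94 <- runif(2000, -1, 1)

# McCrary density test at the cutoff; verbose = TRUE prints the details,
# and the default plot = TRUE draws the estimated density on each side.
p_val <- DCdensity(iwm94, cutpoint = 0, verbose = TRUE)
print(p_val)
```

A small p-value would suggest a discontinuity in the density of the running variable at the cutoff, which is what sorting would produce.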

  4. Bonus: Examine the sensitivity of the main RD result to the choice of bandwidth by calculating and plotting RD estimates and their associated 95% confidence intervals for a range of bandwidths from 0.05 to 0.6. To what extent do the results depend on the choice of bandwidth?

    Hints: Begin by creating a vector of candidate bandwidths, such as thresholds <- seq(from=0.05, to=0.6, by=0.005). Then use a for loop. You can extract the estimate and standard error from an RD estimate named rdest with the code rdest$est[1] and rdest$se[1].
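
Following the hints, the loop could be sketched as below. The data are simulated and are not the real dataset; the 95% intervals use the normal approximation (estimate plus or minus 1.96 standard errors), which is an assumption of this sketch rather than something the exercise prescribes.

```r
library(rdd)  # provides RDestimate()

# Synthetic stand-in data (an assumption for illustration, NOT the real dataset).
set.seed(7)
n <- 2000
dat <- data.frame(iwm94 = runif(n, -1, 1))
dat$hischshr1520f <- 0.15 + 0.03 * (dat$iwm94 > 0) - 0.05 * dat$iwm94 +
  rnorm(n, sd = 0.05)

# Candidate bandwidths, named as in the hint.
thresholds <- seq(from = 0.05, to = 0.6, by = 0.005)
est <- numeric(length(thresholds))
se  <- numeric(length(thresholds))

for (i in seq_along(thresholds)) {
  rdest <- RDestimate(hischshr1520f ~ iwm94, data = dat,
                      cutpoint = 0, bw = thresholds[i])
  est[i] <- rdest$est[1]
  se[i]  <- rdest$se[1]
}

# Normal-approximation 95% confidence bands.
lower <- est - 1.96 * se
upper <- est + 1.96 * se

plot(thresholds, est, type = "l", ylim = range(lower, upper),
     xlab = "Bandwidth", ylab = "RD estimate")
lines(thresholds, lower, lty = 2)
lines(thresholds, upper, lty = 2)
abline(h = 0, col = "grey")
```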